This paper extends the theory of false discovery rates (FDR) pioneered
by Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995)
289-300]. We develop a framework in which the False Discovery
Proportion (FDP), the number of false rejections divided by the number
of rejections, is treated as a stochastic process. After obtaining the
limiting distribution of the process, we demonstrate the validity of a
class of procedures for controlling the False Discovery Rate (the
expected FDP). We construct a confidence envelope for the whole FDP
process. From these envelopes we derive confidence thresholds, for
controlling the quantiles of the distribution of the FDP as well as
controlling the number of false discoveries. We also investigate
methods for estimating the p-value distribution.
C. GENOVESE AND L. WASSERMAN

5. Asymptotic validity of plug-in procedures 

6. Confidence envelopes for FDP 

6.1. Asymptotic confidence envelope 

6.2. Exact confidence envelope 

6.3. Examples 
Appendix 

Notation index 

The following summarizes the most common recurring notation and indi- 
cates where each symbol is defined. 
Symbol                    Description                                   Section

$m$                       Total number of tests performed               2.1
$P^m$                     Vector of p-values $(P_1, \dots, P_m)$        2.2
$H^m$                     Vector of hypothesis indicators
                          $(H_1, \dots, H_m)$                           2.2
$P_{(i)}$                 The ith smallest p-value; $P_{(0)} \equiv 0$  2.2
$M_0$                     Number of true null hypotheses                2.2
$M_1$                     Number of false null hypotheses               2.2
$a$                       Probability of a false null                   2.2
$F, f$                    Alternative p-value distribution (CDF, PDF)   2.2
$G, g$                    Marginal distribution (CDF, PDF) of the
                          $P_i$'s                                       2.2
$\hat G$                  Generic estimator of $G$                      3
$G_m$                     Empirical CDF of $P^m$                        3
$U$                       Uniform CDF                                   2.2
$\Gamma$                  FDP process                                   2.5
$\Xi$                     FNP process                                   2.5
$\varepsilon_m$           Dvoretzky-Kiefer-Wolfowitz neighborhood
                          radius                                        3
$Q$                       Asymptotic mean of $\Gamma$                   2.5
$\tilde Q$                Asymptotic mean of $\Xi$                      2.5

We use $1\{\cdots\}$ and $P\{\cdots\}$ to denote, respectively, the
indicator and probability of the event $\{\cdots\}$; subscripts on $P$
specify the underlying distributions when necessary. We also use $E$ to
denote expectation, and $X_m \rightsquigarrow X$ to denote that $X_m$
converges in distribution to $X$. We use $z_\alpha$ to denote the upper
$\alpha$-quantile of a standard normal.

1. Introduction. Among the many challenges raised by the analysis of
large data sets is the problem of multiple testing. In some settings it
is not unusual to test thousands or even millions of hypotheses.
Examples include functional magnetic resonance imaging, microarray
analysis in genetics and source detection in astronomy. Traditional
methods that provide strong control of familywise error often have low
power and can be unduly conservative in many applications.

Benjamini and Hochberg (BH) (1995, 2000) pioneered an alternative.
Define the False Discovery Proportion (FDP) to be the number of false
rejections divided by the number of rejections. The False Discovery
Rate (FDR) is the expected FDP. BH (1995) provided a distribution-free,
finite sample method for choosing a p-value threshold that guarantees
that the FDR is less than a target level $\alpha$. The same paper
demonstrated that the BH procedure is often more powerful than
traditional methods that control familywise error.

Recently there has been much further work on FDR. We shall not attempt 
a complete review here but mention the following. Benjamini and Yekutieli 
(2001) extended the BH method to a class of dependent tests. Efron, Tib- 
shirani, Storey and Tusher (2001) developed an empirical Bayes approach to 
multiple testing and made interesting connections with FDR. Storey (2002, 
2003) connected the FDR concept with a certain Bayesian quantity and 
proposed a new FDR method which has higher power than the original BH 
method. Finner and Roters (2002) discussed the behavior of the expected 
number of type I errors. Sarkar (2002) considered a general class of stepwise 
multiple testing methods. 

Genovese and Wasserman (2002) showed that, asymptotically, the BH
method corresponds to a fixed threshold method that rejects all p-values
less than a threshold $u^*$, and they characterized $u^*$. They also
introduced the False Nondiscovery Rate (FNR) and found the optimal
threshold $t^*$ in the sense of minimizing FNR subject to a bound on
FDR. The two thresholds are related by $u^* \le t^*$, implying that BH
is (asymptotically) conservative.
Abramovich, Benjamini, Donoho and Johnstone (2000) established a con- 
nection between FDR and minimax point estimation. (An interesting open 
question is whether the asymptotic results obtained in this paper can be ex- 
tended to the sparse regime in the aforementioned paper where the fraction 
of alternatives tends to zero.) 

In this paper we develop some large-sample theory for FDRs and present 
new methods for controlling quantiles of the false discovery distribution. An 
essential idea is to view the proportion of false discoveries as a stochastic 
process indexed by the p-value threshold. The problem of choosing a
threshold then becomes a problem of controlling a stochastic process.
Although
this stochastic process is not observable, we will show that it is amenable to 
inference. 

The main contributions of the paper include the following: 

1. Development of a stochastic process framework for FDP.

2. Investigation of estimators of the p-value distribution, even in the
nonidentifiable case.

3. Proof of the asymptotic validity of a class of methods for FDR control.

4. Two methods for constructing confidence envelopes for the False
Discovery process and the number of false discoveries.

5. New methods, which we call confidence thresholds, for controlling
quantiles of the false discovery distribution.

2. Preliminaries.

2.1. Notation. Consider a multiple testing situation in which $m$ tests
are being performed. Suppose $M_0$ of the null hypotheses are true and
$M_1 = m - M_0$ null hypotheses are false. We can categorize the $m$
tests in the following 2x2 table on whether each null hypothesis is
rejected and whether each null hypothesis is true:

              $H_0$ Not Rejected   $H_0$ Rejected   Total

$H_0$ True    $M_{0|0}$            $M_{1|0}$        $M_0$
$H_0$ False   $M_{0|1}$            $M_{1|1}$        $M_1$
Total         $m - R$              $R$              $m$

We define the FDP and the FNP by

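\[
\mathrm{FDP} =
\begin{cases}
\dfrac{M_{1|0}}{R}, & \text{if } R > 0,\\[4pt]
0, & \text{if } R = 0,
\end{cases}
\tag{1}
\]

and

\[
\mathrm{FNP} =
\begin{cases}
\dfrac{M_{0|1}}{m - R}, & \text{if } R < m,\\[4pt]
0, & \text{if } R = m.
\end{cases}
\tag{2}
\]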

The first is the proportion of rejections that are incorrect, and the second — 
the dual quantity — is the proportion of nonrejections that are incorrect. 
Notice that FDR = E(FDP), and following Genovese and Wasserman (2002),
we define FNR = E(FNP). Storey (2002) considered a different definition
of FDR, called pFDR for positive FDR, by conditioning on the event that
$R > 0$ and discussed the advantages and disadvantages of this definition.
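
As a small illustration of definitions (1) and (2), the sketch below computes the FDP and FNP from a vector of rejection decisions and a vector of truth indicators. The function name `fdp_fnp` and the boolean-list inputs are assumptions made for this example only; they are not notation from the paper.

```python
def fdp_fnp(rejected, is_false_null):
    """Return (FDP, FNP) as in (1) and (2).

    rejected[i]      -- True if the i-th null hypothesis is rejected
    is_false_null[i] -- True if the i-th null hypothesis is false (H_i = 1)
    """
    m = len(rejected)
    r = sum(rejected)  # R, the number of rejections
    # M_{1|0}: true nulls that were rejected (false discoveries)
    m10 = sum(1 for rej, h in zip(rejected, is_false_null) if rej and not h)
    # M_{0|1}: false nulls that were not rejected (false nondiscoveries)
    m01 = sum(1 for rej, h in zip(rejected, is_false_null) if (not rej) and h)
    fdp = m10 / r if r > 0 else 0.0
    fnp = m01 / (m - r) if r < m else 0.0
    return fdp, fnp
```

With two rejections of which one is a true null, and two nonrejections of which one is a false null, both proportions equal 1/2.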

2.2. Model. Let $H_i = 0$ (or 1) if the ith null hypothesis is true
(false) and let $P_i$ denote the ith p-value. Define vectors
$P^m = (P_1, \dots, P_m)$ and $H^m = (H_1, \dots, H_m)$. Let
$P_{(1)} < \cdots < P_{(m)}$ denote the ordered p-values, and define
$P_{(0)} \equiv 0$.

In this paper we use a random effects (or hierarchical) model as in
Efron, Tibshirani, Storey and Tusher (2001). Specifically, we assume
the following for $0 \le a \le 1$:

\[
\begin{aligned}
H_1, \dots, H_m &\sim \mathrm{Bernoulli}(a),\\
\Xi_1, \dots, \Xi_m &\sim \mathcal{L}_{\mathcal F},\\
P_i \mid H_i = 0,\ \Xi_i = \xi_i &\sim \mathrm{Uniform}(0,1),\\
P_i \mid H_i = 1,\ \Xi_i = \xi_i &\sim \xi_i,
\end{aligned}
\]

where $\Xi_1, \dots, \Xi_m$ denote distribution functions and
$\mathcal{L}_{\mathcal F}$ is an arbitrary probability measure over a
class of distribution functions $\mathcal F$ that stochastically
dominates the Uniform(0, 1).

It follows that the marginal distribution of the p-values is

\[
G = (1 - a)U + aF, \tag{3}
\]

where $U(t)$ denotes the Uniform(0,1) CDF and
$F(t) = \int \xi(t)\, d\mathcal{L}_{\mathcal F}(\xi)$. Note that
$G \ge U$. Except where noted we assume that $G$ is strictly concave
with density $g = G'$.

Remark 2.1. A more common approach in multiple testing is to use 
a conditional model in which Hi, . . . ,Hm are fixed, unknown binary values. 
The results in this paper can be cast in a conditional framework but we find 
the random effects framework to be more intuitive. 

Define $M_0 = \sum_i (1 - H_i)$ and $M_1 = \sum_i H_i$. Hence,
$M_0 \sim \mathrm{Binomial}(m, 1 - a)$ and $M_1 = m - M_0$.

2.3. The Benjamini-Hochberg and plug-in methods. The Benjamini-Hochberg
(BH) procedure is a distribution-free method for choosing which null
hypotheses to reject while guaranteeing that FDR $\le \alpha$ for some
preselected level $\alpha$. The procedure rejects all null hypotheses
for which $P_i \le P_{(R_{BH})}$, where

\[
R_{BH} = \max\Bigl\{ 0 \le i \le m : P_{(i)} \le \alpha \frac{i}{m} \Bigr\}. \tag{4}
\]

BH (1995) proved that this procedure guarantees

\[
E(\mathrm{FDP} \mid M_0) \le \frac{M_0}{m}\,\alpha \le \alpha, \tag{5}
\]

regardless of how many nulls are true and regardless of the distribution
of the p-values under the alternatives. (When the p-value distribution
is continuous, BH shows that the first inequality is an equality.) In
the context of our model, this result becomes

\[
\mathrm{FDR} \le (1 - a)\alpha \le \alpha. \tag{6}
\]

Genovese and Wasserman (2002) showed that, asymptotically, the BH
procedure corresponds to rejecting the null when the p-value is less
than $u^*$, where $u^*$ is the solution to the equation
$G(u) = u/\alpha$, in the notation of the current paper. This $u^*$
satisfies $\alpha/m \le u^* \le \alpha$ for large $m$, which shows that
the BH method is intermediate between Bonferroni (corresponding to
$\alpha/m$) and uncorrected testing (corresponding to $\alpha$). They
also showed that $u^*$ is strictly less than the optimal p-value cutoff.
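
The step-up rule in (4) can be sketched in a few lines. The function name `bh_threshold` and its return convention are assumptions of this illustration, not part of the paper; the code simply scans the ordered p-values and keeps the largest index satisfying the bound.

```python
def bh_threshold(pvalues, alpha):
    """Return (R_BH, P_(R_BH)) from (4), using the convention P_(0) = 0."""
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    r_bh = 0
    for i, p in enumerate(p_sorted, start=1):
        if p <= alpha * i / m:
            r_bh = i  # keep the largest index i with P_(i) <= alpha * i / m
    threshold = p_sorted[r_bh - 1] if r_bh > 0 else 0.0
    return r_bh, threshold
```

Because the maximum in (4) is taken over all indices, a p-value can be rejected even if some smaller p-value fails its own bound; the second test case below (four rejections although the second and third ordered p-values exceed their bounds) shows this step-up behavior.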
Benjamini and Hochberg (2000), in work originally written in 1994,
showed that the power of the BH (1995) procedure could be improved by
estimating the number of true null hypotheses. They also proposed an
estimator $\widehat{\mathrm{FDR}}(t)$ of FDR(t) and proposed a threshold
$T$ that maximizes the number of rejections subject to
$\widehat{\mathrm{FDR}}(T) \le \alpha$. A similar approach was
investigated in Storey (2002) and Storey, Taylor and Siegmund (2004). It
remains an open question whether FDR(T) $\le \alpha$. We address an
asymptotic version of this question in Section 5.

The threshold $T$ chosen this way can also be viewed as a plug-in
estimator. Let

\[
t(a, G) = \sup\Bigl\{ t : \frac{(1 - a)t}{G(t)} \le \alpha \Bigr\}. \tag{7}
\]

Suppose we reject whenever the p-value is less than $t(a, G)$. From
Genovese and Wasserman (2002) it follows that, asymptotically, the FDR
is less than $\alpha$. The intuition for (7) is that $(1 - a)t/G(t)$ is,
up to an exponentially small term, the FDR at a fixed threshold $t$.
Moreover, if $G$ is concave this threshold has the smallest asymptotic
FNR among all procedures with FDR less than or equal to $\alpha$ [cf.
Genovese and Wasserman (2002)]. We call $t(a, G)$ the oracle threshold.
The standard plug-in method is to estimate the functional $t(a, G)$ by
$T = t(\hat a, \hat G)$, where $\hat a$ and $\hat G$ are estimators of
$a$ and $G$. Let $G_m$ be the empirical CDF of $P^m$. Theorem 2 of BH
(1995) shows that $T_{BH} = t(0, G_m)$ yields the BH threshold.
Benjamini and Hochberg (2000) and Storey (2002) showed that
$T = t(\hat a_0, G_m)$ has higher power than the BH threshold, where

\[
\hat a_0 = \max\Bigl\{ 0, \frac{G_m(t_0) - t_0}{1 - t_0} \Bigr\}
\]

and $t_0 \in (0, 1)$. Clearly, other estimators of $a$ and $G$ are
possible and we shall call any threshold of the form
$T = t(\hat a, \hat G)$ a plug-in threshold.

We describe alternative estimators of $a$ in Section 3.2. Storey (2002)
provided simulations to show that the plug-in procedure has good power
but did not provide a proof that it controls FDR at level $\alpha$. We
settle this question in Section 5 where we show that, under weak
conditions on $\hat a$, the procedure asymptotically controls FDR at
level $\alpha$.
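
The plug-in construction can be illustrated with a short sketch. It computes Storey's estimator $\hat a_0$ and then evaluates the condition in (7) with $G$ replaced by the empirical CDF. Two simplifying assumptions are made here and are not from the paper: the supremum is restricted to the observed p-values, and the p-values are assumed distinct. The function names are hypothetical.

```python
def storey_a0(pvalues, t0):
    """Storey's estimator: max{0, (G_m(t0) - t0) / (1 - t0)}."""
    m = len(pvalues)
    g_t0 = sum(p <= t0 for p in pvalues) / m  # empirical CDF at t0
    return max(0.0, (g_t0 - t0) / (1.0 - t0))


def plug_in_threshold(pvalues, alpha, a_hat):
    """Largest observed p-value P_(i) with (1 - a_hat) * P_(i) / (i/m) <= alpha.

    Assumption of this sketch: the sup in (7) is searched only over the
    order statistics, where G_m(P_(i)) = i/m for distinct p-values.
    """
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    threshold = 0.0
    for i, p in enumerate(p_sorted, start=1):
        if (1.0 - a_hat) * p <= alpha * i / m:
            threshold = p
    return threshold
```

With `a_hat = 0` this search reduces to the BH rule in (4); a positive estimate of $a$ relaxes the bound and can only increase the threshold.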

2.4. Multiple testing procedures. A multiple testing procedure $T$ is a
mapping taking $[0, 1]^m$ into $[0, 1]$, where it is understood that the
null hypotheses corresponding to all p-values less than $T(P^m)$ are
rejected. We often call $T$ the threshold.

Let $\alpha, t \in [0, 1]$ and $0 \le r \le m$, and recall that
$P_{(0)} \equiv 0$. Let $\hat G$ and $\hat g$ be generic estimates of
$G$ and $g = G'$, respectively. Similarly, let
$\hat P\{H = h \mid P = t\}$ denote an estimator of
$P\{H = h \mid P = t\}$.
Some examples of multiple testing procedures will illustrate the
generality of the framework:

Uncorrected testing     $T_U(P^m) = \alpha$
Bonferroni              $T_B(P^m) = \alpha/m$
Fixed threshold at t    $T_t(P^m) = t$
Benjamini-Hochberg      $T_{BH}(P^m) = \sup\{t : G_m(t) = t/\alpha\} = P_{(R_{BH})}$
Oracle                  $T_O(P^m) = \sup\{t : G(t) = (1 - a)t/\alpha\}$
Plug-in                 $T_{PI}(P^m) = \sup\{t : \hat G(t) = (1 - \hat a)t/\alpha\}$
First r                 $T_{(r)}(P^m) = P_{(r)}$
Bayes' classifier       $T_{BC}(P^m) = \sup\{t : \hat g(t) > 1\}$
Regression classifier   $T_{Reg}(P^m) = \sup\{t : \hat P\{H_1 = 1 \mid P_1 = t\} > 1/2\}$

2.5. FDP and FNP as stochastic processes. An important idea that we 
use throughout the paper is that the FDP, regarded as a function of the 
threshold t, is a stochastic process. This observation is crucial for studying 
the properties of procedures. 

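Define the FDP process

\[
\Gamma(t) \equiv \Gamma(t, P^m, H^m)
= \frac{\sum_i 1\{P_i \le t\}(1 - H_i)}
       {\sum_i 1\{P_i \le t\} + 1\{\text{all } P_i > t\}}, \tag{8}
\]

where the last term in the denominator makes $\Gamma = 0$ when no
p-values are below the threshold. Also define the FNP process

\[
\Xi(t) \equiv \Xi(t, P^m, H^m)
= \frac{\sum_i 1\{P_i > t\} H_i}
       {\sum_i 1\{P_i > t\} + 1\{\text{all } P_i \le t\}}. \tag{9}
\]

The FDP and FNP of a procedure $T$ are
$\Gamma(T) \equiv \Gamma(T(P^m), P^m, H^m)$ and
$\Xi(T) \equiv \Xi(T(P^m), P^m, H^m)$. Let

\[
Q(t) = (1 - a)\frac{t}{G(t)}, \tag{10}
\]

\[
\tilde Q(t) = a\,\frac{1 - F(t)}{1 - G(t)}. \tag{11}
\]

The following lemma is a corollary of Theorem 1 in Storey (2002).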

Lemma 2.1. Under the mixture model, for $t > 0$,

\[
E\,\Gamma(t) = Q(t)\bigl(1 - (1 - G(t))^m\bigr), \qquad
E\,\Xi(t) = \tilde Q(t)\bigl(1 - G(t)^m\bigr).
\]

The second terms on the right-hand side of both equations differ from 1
by an exponentially small quantity.
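
The processes (8) and (9) and the first identity of Lemma 2.1 can be checked numerically. The sketch below evaluates $\Gamma(t)$ and $\Xi(t)$ for given data and compares a Monte Carlo average of $\Gamma(t)$ with $Q(t)(1 - (1 - G(t))^m)$. The alternative distribution $F(t) = t^{1/4}$ (p-values generated as $U^4$), the parameter values and all function names are assumptions chosen only for illustration.

```python
import random


def fdp_process(t, pvalues, h):
    """Gamma(t) from (8); h[i] = 1 marks a false null."""
    rejected = [p <= t for p in pvalues]
    r = sum(rejected)
    false_rej = sum(1 for rej, hi in zip(rejected, h) if rej and hi == 0)
    return false_rej / (r + (1 if r == 0 else 0))


def fnp_process(t, pvalues, h):
    """Xi(t) from (9)."""
    kept = [p > t for p in pvalues]
    k = sum(kept)
    missed = sum(1 for keep, hi in zip(kept, h) if keep and hi == 1)
    return missed / (k + (1 if k == 0 else 0))


def expected_fdp(t, m, a, F):
    """Right-hand side of Lemma 2.1 for the FDP process."""
    G = (1 - a) * t + a * F(t)
    return (1 - a) * t / G * (1 - (1 - G) ** m)


def simulate_mean_fdp(t, m, a, reps, seed=0):
    """Monte Carlo mean of Gamma(t) under the mixture model.

    Assumed alternative (illustration only): P = U**4, so F(t) = t**0.25.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        h = [1 if rng.random() < a else 0 for _ in range(m)]
        p = [rng.random() ** 4 if hi else rng.random() for hi in h]
        total += fdp_process(t, p, h)
    return total / reps
```

For small $m$ the correction factor $1 - (1 - G(t))^m$ is visibly different from 1, which is why the simulation uses $m = 5$ in the check below; for large $m$ the mean of $\Gamma(t)$ is essentially $Q(t)$.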
One of the essential difficulties in studying a procedure $T$ is that
$\Gamma(T)$ is the evaluation of the stochastic process $\Gamma(\cdot)$
at a random variable $T$. Both depend on the observed data, and in
general they are correlated. In particular, if $\hat Q(t)$ estimates
FDR(t) well at each fixed $t$, it does not follow that $\hat Q(T)$
estimates FDR(T) well at a random $T$. The stochastic process point of
view provides a suitable framework for addressing this problem.

3. Estimating the p-value distribution. Recall that, under the mixture
model, $P_1, \dots, P_m$ have CDF $G(t) = (1 - a)t + aF(t)$. Let
$\hat G$ denote an estimator of $G$. Let $G_m$ denote the empirical CDF
of $P^m$. We will use the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality:
for any $x > 0$,

\[
P\{\|G_m - G\|_\infty > x\} \le 2e^{-2mx^2}, \tag{12}
\]

where $\|F - G\|_\infty = \sup_{0 \le t \le 1} |F(t) - G(t)|$. Given
$\alpha \in (0, 1)$, let

\[
\varepsilon_m \equiv \varepsilon_m(\alpha)
= \sqrt{\frac{1}{2m}\log\Bigl(\frac{2}{\alpha}\Bigr)} \tag{13}
\]

so that, from DKW, $P\{\|G_m - G\|_\infty > \varepsilon_m\} \le \alpha$.
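
To make (12) and (13) concrete, the following sketch computes the DKW radius and the resulting band around the empirical CDF at the ordered p-values. The function names are hypothetical, and the band is clipped to [0, 1]; distinct p-values are assumed so that $G_m(P_{(i)}) = i/m$.

```python
import math


def dkw_radius(m, alpha):
    """Radius eps_m(alpha) = sqrt(log(2/alpha) / (2m)) from (13)."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * m))


def dkw_band(pvalues, alpha):
    """Return (P_(i), lower, upper) triples bounding G at each order statistic."""
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    eps = dkw_radius(m, alpha)
    band = []
    for i, p in enumerate(p_sorted, start=1):
        g_m = i / m  # empirical CDF at the i-th order statistic
        band.append((p, max(0.0, g_m - eps), min(1.0, g_m + eps)))
    return band
```

Substituting $x = \varepsilon_m$ into the right-hand side of (12) returns exactly $\alpha$, which is the property the first check below verifies.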

Several improvements on $G_m$ are possible. Since $G \ge U$, we replace
any estimator $\hat G$ with $\max\{\hat G(t), t\}$. When $G$ is assumed
to be concave, a better estimate of $G$ is the least concave majorant
(LCM) $G_{LCM,m}$ defined to be the infimum of the set of all concave
CDFs lying above $G_m$. Most p-value densities in practical problems are
decreasing in $p$, which implies that $G$ is concave. We can also
replace $G_{LCM,m}$ with $\max\{G_{LCM,m}(t), t\}$. The DKW inequality
and the standard limiting results still hold for the modified versions
of both estimators. We will thus use $\hat G$ to denote the modified
estimators in either case. We will indicate explicitly if concavity is
required or if the LCM estimator is proscribed.
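
One way to compute the least concave majorant of the empirical CDF is as the upper convex hull of the points $(0, 0)$, $(P_{(i)}, i/m)$ and $(1, 1)$. The sketch below uses a standard monotone-chain scan for this; the function names are hypothetical, and distinct p-values strictly inside (0, 1) are assumed so that no hull segment is vertical.

```python
def lcm_vertices(pvalues):
    """Vertices of the least concave majorant of the empirical CDF.

    Assumes distinct p-values in (0, 1). Points are scanned left to right
    and any point lying on or below the chord joining its neighbors is
    discarded.
    """
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    points = [(0.0, 0.0)]
    points += [(p, i / m) for i, p in enumerate(p_sorted, start=1)]
    points.append((1.0, 1.0))
    hull = []
    for pt in points:
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (pt[1] - ay) - (by - ay) * (pt[0] - ax)
            if cross >= 0:  # middle point is not strictly above the chord
                hull.pop()
            else:
                break
        hull.append(pt)
    return hull


def lcm_value(hull, t):
    """Evaluate the piecewise-linear majorant at t in [0, 1]."""
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= t <= x1:
            return y0 + (y1 - y0) * (t - x0) / (x1 - x0)
    raise ValueError("t must lie in [0, 1]")
```

The further modification $\max\{G_{LCM,m}(t), t\}$ described in the text can be applied to the value returned by `lcm_value`.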

Once we obtain estimates $\hat a$ and $\hat G$, we define

\[
\hat Q(t) = \frac{(1 - \hat a)t}{\hat G(t)}. \tag{14}
\]

3.1. Identifiability and purity. Before discussing the estimation of
$a$, it is helpful to first discuss identifiability. For example, if $a$
is not identifiable, there is no guarantee that the estimate used in the
plug-in method will give good performance. The results in the ensuing
sections show that despite not being completely identified, it is
possible to make sensible inferences about $a$.

Say that $F$ is pure if $\operatorname{ess\,inf}_t f(t) = 0$, where $f$
is the density of $F$. Let $O_F$ be the set of pairs $(b, H)$ such that
$b \in [0, 1]$, $H \in \mathcal F$ and $F = (1 - b)U + bH$. $F$ is
identifiable if $O_F = \{(1, F)\}$.
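Define

\[
\zeta_F = \inf\{b : (b, H) \in O_F\}, \qquad
\underline{F} = \frac{F - (1 - \zeta_F)U}{\zeta_F}, \qquad
\underline{a}_F = a\,\zeta_F .
\]

We will often drop the subscript $F$ on $\underline{a}_F$ and
$\zeta_F$. Note that $G$ can be decomposed as

\[
\begin{aligned}
G &= (1 - a)U + aF\\
  &= (1 - a)U + a[(1 - \zeta)U + \zeta\underline{F}]\\
  &= (1 - a\zeta)U + a\zeta\underline{F}\\
  &= (1 - \underline{a})U + \underline{a}\,\underline{F}.
\end{aligned}
\]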

Purity implies identifiability but not vice versa. Consider the
following example. Let $\mathcal F$ be the Normal($\theta$, 1) family
and consider testing $H_0: \theta = 0$ versus $H_1: \theta \ne 0$. The
density of the p-value is

\[
f_\theta(p) = \tfrac12 e^{-n\theta^2/2}
\Bigl[ e^{-\sqrt n\,\theta\,\Phi^{-1}(1 - p/2)}
     + e^{\sqrt n\,\theta\,\Phi^{-1}(1 - p/2)} \Bigr].
\]

Now, $f_\theta(1) = e^{-n\theta^2/2} > 0$, so this test is impure.
However, the parametric assumption makes $a$ and $\theta$ identifiable
when the null is false. It is worth noting that $f_\theta(1)$ is
exponentially small in $n$. Hence, the difference between $a$ and
$\underline{a}$ is small. Even when $X$ has a t-distribution with $\nu$
degrees of freedom, $f_\theta(1) = (1 + n\theta^2/\nu)^{-(\nu+1)/2}$.
Thus, in practical cases, $a - \underline{a}$ will be quite small. On
the other hand, one-sided tests for continuous exponential families are
pure and identifiable.
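
The size of the impurity in the Normal example can be checked numerically. The sketch below evaluates the two-sided p-value density given above; the function name and the parameter values used in the checks are assumptions for illustration, and $\Phi^{-1}$ is taken from Python's standard library.

```python
import math
from statistics import NormalDist


def two_sided_pvalue_density(p, theta, n):
    """Density f_theta(p) of the two-sided p-value in the Normal(theta, 1) example.

    Valid for 0 < p <= 1; n is the sample size appearing in the formula.
    """
    z = NormalDist().inv_cdf(1.0 - p / 2.0)  # Phi^{-1}(1 - p/2)
    s = math.sqrt(n) * theta
    return 0.5 * math.exp(-n * theta ** 2 / 2.0) * (math.exp(-s * z) + math.exp(s * z))
```

At $p = 1$ the quantile term vanishes and the density reduces to $e^{-n\theta^2/2}$; with, say, $\theta = 0.5$ and $n = 16$ this is $e^{-2}$, and it shrinks rapidly as $n$ grows. At $\theta = 0$ the density is identically 1, the uniform null case.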

The problem of estimating $a$ has been considered by Efron, Tibshirani,
Storey and Tusher (2001) and Storey (2002), who also discussed the
identifiability issue. In particular, Storey noted that
$G(t) = (1 - a)t + aF(t) \le (1 - a)t + a$ for all $t$. It then follows
that, for any $t_0 \in (0, 1)$,

\[
0 \le a_0 \equiv \frac{G(t_0) - t_0}{1 - t_0} \le \underline{a} \le a \le 1. \tag{15}
\]

Thus, an identifiable lower bound on $a$ is $a_0$. The following result
gives precise information about the best bounds that are possible.

Proposition 3.1. If $F$ is absolutely continuous and stochastically
dominates $U$, then

\[
\zeta = 1 - \inf_t F'(t) \quad\text{and}\quad
\underline{a} = 1 - \inf_t G'(t).
\]

If $F$ is concave, then the infima are achieved at $t = 1$. For any
$b \in [\zeta, 1]$ we can write $G = (1 - ab)U + abF_b$, where
$F_b = (F - (1 - b)U)/b$ is a CDF and $F \le F_b$.
3.2. Estimating a. Here we discuss estimating $a$. Related work includes
Schweder and Spjøtvoll (1982), Hochberg and Benjamini (1990), Benjamini
and Hochberg (2000) and Storey (2002).

We begin with a uniform confidence interval for $\underline{a}$.

Theorem 3.1. Let

\[
\hat a_* = \max_t \frac{\hat G(t) - t - \varepsilon_m}{1 - t}. \tag{16}
\]

Then $[\hat a_*, 1]$ is a uniform $1 - \alpha$ confidence interval for
$\underline{a}$, that is,

\[
\inf_{a, F} P_{a,F}\{\underline{a} \in [\hat a_*, 1]\} \ge 1 - \alpha, \tag{17}
\]

and if one restricts $\hat G$ to be the empirical distribution function,
then for each $(a, F)$ pair,

(18) P„,Ha€K,ll}<l-a + 2£(-iy«(f)'%0(<!^ 

where the remainder term may depend on $a$ and $F$. Because
$\underline{a} \le a$, $[\hat a_*, 1]$ is a valid finite-sample
$1 - \alpha$ confidence interval for $a$ as well.
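
The lower endpoint in (16) is easy to compute when $\hat G$ is the empirical CDF. On each interval between consecutive order statistics the ratio in (16) is decreasing in $t$, so the maximum is attained at $t = 0$ or at an observed p-value. The sketch below uses this fact; the function name is hypothetical, the radius `eps` is passed in directly (it would normally be $\varepsilon_m(\alpha)$ from (13)), and clipping the result at zero is a convenience added here, not part of (16).

```python
def a_lower_bound(pvalues, eps):
    """Lower endpoint max_t (G_m(t) - t - eps) / (1 - t), clipped at 0.

    Assumes distinct p-values; eps is the DKW radius eps_m(alpha).
    """
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    best = -eps  # value of the ratio at t = 0, where G_m(0) = 0
    for i, p in enumerate(p_sorted, start=1):
        if p < 1.0:
            best = max(best, (i / m - p - eps) / (1.0 - p))
    return max(0.0, best)
```

With few tests the DKW radius is large and the bound is typically 0, which is uninformative but still valid; the bound becomes useful as $m$ grows and $\varepsilon_m$ shrinks.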

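Proof. The inequality (17) follows immediately from DKW because
$G(t) \ge \hat G(t) - \varepsilon_m$ for all $t$ with probability at
least $1 - \alpha$. The sum on the right-hand side of (18) follows from
the closed-form limiting distribution of the Kolmogorov-Smirnov
statistic, and the order of the error follows from the Hungarian
embedding. To see this, note that

\[
\begin{aligned}
\underline{a} < \hat a_*
&\implies \underline{a}\sqrt m < \max_t \Bigl[ \sqrt m\,\frac{G_m(t) - G(t)}{1 - t}
      + \sqrt m\,\frac{G(t) - t}{1 - t}
      - \frac{\varepsilon_m\sqrt m}{1 - t} \Bigr]\\
&\implies \underline{a}\sqrt m < \max_t \Bigl[ \sqrt m\,\frac{G_m(t) - G(t)}{1 - t}
      - \frac{\varepsilon_m\sqrt m}{1 - t} \Bigr] + \sqrt m\,\underline{a}\\
&\implies 0 < \max_t \frac{\sqrt m\,(G_m(t) - G(t)) - \varepsilon_m\sqrt m}{1 - t}\\
&\implies 0 < \max_t \sqrt m\,(G_m(t) - G(t)) - \varepsilon_m\sqrt m\\
&\implies \|\sqrt m\,(G_m - G)\|_\infty > \varepsilon_m\sqrt m .
\end{aligned}
\]

Hence,

\[
P\{\underline{a} < \hat a_*\}
\le P\{\|\sqrt m\,(G_m - G)\|_\infty > \varepsilon_m\sqrt m\}. \tag{19}
\]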
Next apply the Hungarian embedding [van der Vaart (1998), page 269]:

\[
\limsup_{m \to \infty} \frac{\sqrt m}{(\log m)^2}
\,\|\sqrt m\,(G_m - G) - B_m\|_\infty < \infty \quad \text{a.s.},
\]

for a sequence of Brownian bridges $B_m$. Recall the distribution of the
Kolmogorov-Smirnov statistic:

\[
P\{\|B\|_\infty > x\} = 2\sum_{j=1}^{\infty} (-1)^{j+1} e^{-2j^2x^2}
\]

for a generic Brownian bridge $B$. The result follows by taking
$x = \sqrt m\,\varepsilon_m$. In the concave case, the LCM can be
substituted for $\hat G$ and the result still holds since, by Marshall's
lemma, $\|G_{LCM,m} - G\|_\infty \le \|G_m - G\|_\infty$. □
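Proposition 3.2 (Storey's estimator). Fix $t_0 \in (0, 1)$ and let

\[
\hat a_0 = \Bigl( \frac{G_m(t_0) - t_0}{1 - t_0} \Bigr)_+ .
\]

If $G(t_0) > t_0$,

\[
\hat a_0 \xrightarrow{P} \frac{G(t_0) - t_0}{1 - t_0} \equiv a_0 \le \underline{a}
\]

and

\[
\sqrt m \Bigl( \hat a_0 - \frac{G(t_0) - t_0}{1 - t_0} \Bigr)
\rightsquigarrow N\Bigl( 0, \frac{G(t_0)(1 - G(t_0))}{(1 - t_0)^2} \Bigr).
\]

If $G(t_0) = t_0$,

\[
\sqrt m\,\hat a_0 \rightsquigarrow
\tfrac12 \delta_0 + \tfrac12 N^+\Bigl( 0, \frac{t_0}{1 - t_0} \Bigr),
\]

where $\delta_0$ is a point-mass at zero and $N^+$ is a
positive-truncated normal.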

A consistent estimate of $\underline{a}$ is available if we assume weak
smoothness conditions on $g$. For example, one can use the spacings
estimator of Swanepoel (1999) which is of the form $2r_m/(mV_m)$, where
$r_m = m^{4/5}(\log m)^{-2\delta}$ and $V_m$ is a selected spacing in
the order statistics of the p-values.

Theorem 3.2. Assume that at the value $t$ where $g$ achieves its
minimum, $g''$ is bounded away from 0 and $\infty$ and Lipschitz of
order $\lambda > 0$. For every $\delta > 0$, there exists an estimator
$\hat a$ such that

\[
\frac{m^{2/5}}{(\log m)^{\delta}}\,(\hat a - \underline{a})
\rightsquigarrow N\bigl(0, (1 - \underline{a})^2\bigr).
\]

Proof. Let a be the estimator defined in Swanepoel (1999) with r^ = 
?TT-^'^(logm)~^ and Sm = m'^'^{logm) . The result follows from Swanepoel 
[(1999), Theorem 1.3]. D 
Remark 3.1. An alternative estimator is
$\hat a = 1 - \min_t \hat g(t)$, where $\hat g$ is a kernel estimator.

Now suppose we assume only that $G$ is concave and hence $g = G'$ is
decreasing. Hengartner and Stark (1995) derived a finite-sample
confidence envelope $[\gamma^-(\cdot), \gamma^+(\cdot)]$ for a density
$g$ assuming only that it is monotone. Define

\[
\hat a_{HS} = 1 - \min\{h(1) : \gamma^- \le h \le \gamma^+\}.
\]

Theorem 3.3. If $G$ is concave and $g = G'$ is Lipschitz of order 1 in a
neighborhood of 1, then

\[
\Bigl( \frac{n}{\log n} \Bigr)^{1/3} (\hat a_{HS} - \underline{a})
\xrightarrow{P} 0.
\]

Also, $[1 - \gamma^+(1), 1 - \gamma^-(1)]$ is a $1 - \alpha$ confidence
interval for $\underline{a}$ for $0 < \alpha < 1$ and all $m$. Further,

\[
\inf_{a, F} P\{\underline{a} \in [1 - \gamma^+(1), 1]\} \ge 1 - \alpha,
\]

where the infimum is over all concave $F$'s.

Proof. Follows from Hengartner and Stark (1995). □

3.3. Estimating F. It may be useful in some cases to estimate the
alternative mixture distribution $F$. There are many possible methods;
we consider here projection estimators defined by

\[
\hat F_m = \arg\min_{H \in \mathcal F}
\|\hat G - (1 - \hat a)U - \hat a H\|_\infty, \tag{20}
\]

where $\hat a$ is an estimate of $a$. The Appendix gives an algorithm to
find $\hat F_m$.

It is helpful to consider first the case where $a$ is known, and here we
substitute $a$ for $\hat a$ in the definition of $\hat F_m$.

Theorem 3.4. Let

\[
\hat F_m = \arg\min_{H \in \mathcal F} \|\hat G - (1 - a)U - aH\|_\infty .
\]

Then

\[
\|F - \hat F_m\|_\infty \le \frac{2\|G - \hat G\|_\infty}{a}
\xrightarrow{\text{a.s.}} 0.
\]
Proof.

\[
\begin{aligned}
a\|F - \hat F_m\|_\infty
&= \|aF - a\hat F_m\|_\infty\\
&= \|(1 - a)U + aF - (1 - a)U - a\hat F_m\|_\infty\\
&= \|G - (1 - a)U - a\hat F_m\|_\infty\\
&= \|G - \hat G + \hat G - (1 - a)U - a\hat F_m\|_\infty\\
&\le \|G - \hat G\|_\infty + \|\hat G - (1 - a)U - a\hat F_m\|_\infty\\
&\le \|G - \hat G\|_\infty + \|\hat G - (1 - a)U - aF\|_\infty\\
&= \|G - \hat G\|_\infty + \|\hat G - G\|_\infty .
\end{aligned}
\]

The last statement follows from the uniform consistency of $\hat G$. □

When $a$ is unknown, the projection estimator $\hat F$ is consistent
whenever we have a consistent estimator of $\underline{a}$. Recall that
in the identifiable case $a = \underline{a}$ and $F = \underline{F}$.

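Theorem 3.5. Let $\hat a$ be a consistent estimator of $\underline{a}$.
Then

\[
\|\hat F - \underline{F}\|_\infty \le
\frac{\|\hat G - G\|_\infty + |\hat a - \underline{a}|}{\hat a}
\xrightarrow{P} 0.
\]

Proof. Let
$\delta_m = \|\hat G - (1 - \hat a)U - \hat a\hat F\|_\infty$. Since
$\hat F$ is the minimizer,

\[
\begin{aligned}
\delta_m &\le \|\hat G - (1 - \hat a)U - \hat a\underline{F}\|_\infty\\
&= \|\hat G - G + (\hat a - \underline{a})U - (\hat a - \underline{a})\underline{F}\|_\infty\\
&\le \|\hat G - G\|_\infty + |\hat a - \underline{a}|
\xrightarrow{P} 0.
\end{aligned}
\]

We also have that

\[
\delta_m \ge \bigl|\, \|\hat G - (1 - \hat a)U - \hat a\underline{F}\|_\infty
- \hat a\|\hat F - \underline{F}\|_\infty \bigr| .
\]

Since $\delta_m$ and
$\|\hat G - (1 - \hat a)U - \hat a\underline{F}\|_\infty \xrightarrow{P} 0$
by the above and $\hat a \xrightarrow{P} \underline{a}$, it follows that
$\|\hat F - \underline{F}\|_\infty \xrightarrow{P} 0$. Moreover,

\[
\|\hat F - \underline{F}\|_\infty \le
\frac{\|\hat G - G\|_\infty + |\hat a - \underline{a}|}{\hat a}.
\]

□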
4. Limiting distributions. In this section we discuss the limiting
distribution of $\Gamma$ and $\hat Q$. Let

\[
\Lambda_0(t) = \frac1m \sum_i (1 - H_i)1\{P_i \le t\}
\quad\text{and}\quad
\Lambda_1(t) = \frac1m \sum_i H_i 1\{P_i \le t\},
\]

and, for each $c \in (0, 1)$, define

\[
\Omega_c(t) = (1 - c)\Lambda_0(t) - c\Lambda_1(t) = \frac1m \sum_i D_i(t),
\]

where $D_i(t) = 1\{P_i \le t\}(1 - H_i - c)$. Let

\[
\mu_c(t) = E D_1(t) = (1 - a)t - cG(t).
\]

Let $(W_0, W_1)$ be a continuous two-dimensional mean zero Gaussian
process with covariance kernel
$R_{ij}(s, t) = \mathrm{Cov}(W_i(s), W_j(t))$ given by

\[
R(s, t) =
\begin{bmatrix}
(1 - a)(s \wedge t) - (1 - a)^2 st & -(1 - a)s\,aF(t)\\
-(1 - a)t\,aF(s) & aF(s \wedge t) - a^2F(s)F(t)
\end{bmatrix}. \tag{21}
\]

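Theorem 4.1. Let $W$ be a continuous mean zero Gaussian process with
covariance

\[
\begin{aligned}
K_\Omega(s, t) ={}& (1 - a)(1 - c)\bigl[(1 - c)(s \wedge t - (1 - a)st)
+ ac(tF(s) + sF(t))\bigr]\\
&+ ac\bigl[cF(s \wedge t) - acF(s)F(t)\bigr].
\end{aligned}
\tag{22}
\]

Then

\[
\sqrt m\,(\Omega_c - \mu_c) \rightsquigarrow W.
\]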

Proof. Let 

Zm{t) = V^{n,{t) - ^^,{t)) and Z*^{t) = ^{^Ht) - ji,{t)) 

forte [0,1]. Let 

\[
(W_{m,0}(t), W_{m,1}(t)) =
\bigl( \sqrt m\,(\Lambda_0(t) - (1 - a)t),\ \sqrt m\,(\Lambda_1(t) - aF(t)) \bigr).
\]

By standard empirical process theory, $(W_{m,0}(t), W_{m,1}(t))$
converges to $(W_0, W_1)$. The covariance kernel $R$ stated in (21)
follows by direct calculation. The result for $\Omega_c$ is immediate
since $\Omega_c$ is a linear combination of $\Lambda_0$ and $\Lambda_1$. □

Theorem 4.2 (Limiting distribution of FDP process). For
$t \in [\delta, 1]$ for any $\delta > 0$, let

\[
Z_m(t) = \sqrt m\,(\Gamma_m(t) - Q(t)).
\]

Let $Z$ be a Gaussian process on $(0, 1]$ with mean 0 and covariance
kernel

\[
K_\Gamma(s, t) = a(1 - a)\,
\frac{(1 - a)stF(s \wedge t) + aF(s)F(t)(s \wedge t)}{G^2(s)G^2(t)} .
\]

Then $Z_m \rightsquigarrow Z$ on $[\delta, 1]$.

Remark 4.1. The reason for restricting the theorem to $[\delta, 1]$ is
that the variance of the process is infinite at zero.



Proof of Theorem 4.2. Note that
$\Gamma_m(t) = \Lambda_0(t)/(\Lambda_0(t) + \Lambda_1(t)) \equiv r(\Lambda_0, \Lambda_1)$,
where $\Lambda_0$ and $\Lambda_1$ are defined as before and
$r(\cdot, \cdot)$ maps $\ell^\infty \times \ell^\infty \to \ell^\infty$,
where $\ell^\infty$ is the set of bounded functions on $(\delta, 1]$
endowed with the sup norm. Note that $r((1 - a)U, aF) = Q$. It can be
verified that $r(\cdot, \cdot)$ is Fréchet differentiable at
$((1 - a)U, aF)$ with derivative

\[
r'_{((1 - a)U, aF)}(V) = \frac{aFV_0 - (1 - a)UV_1}{G^2},
\]

where $U(t) = t$, $V = (V_0, V_1)$. Hence, by the functional delta
method [van der Vaart (1998), Theorem 20.8],

\[
Z_m \rightsquigarrow r'_{((1 - a)U, aF)}(W)
= \frac{aFW_0 - (1 - a)UW_1}{G^2},
\]

where $(W_0, W_1)$ is the process defined just before (21). The
covariance kernel of the latter expression is $K_\Gamma(s, t)$. □
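
The final step of the proof, that the delta-method limit has covariance $K_\Gamma$, can be checked numerically by forming the quadratic form of the derivative weights with the kernel $R$ in (21) and comparing it with the closed form of Theorem 4.2. The function names, the choice $F(t) = \sqrt t$ and the parameter values in the checks are assumptions made only for this illustration.

```python
def fdp_kernel_closed_form(s, t, a, F):
    """K_Gamma(s, t) as stated in Theorem 4.2."""
    G = lambda x: (1 - a) * x + a * F(x)
    u = min(s, t)
    num = (1 - a) * s * t * F(u) + a * F(s) * F(t) * u
    return a * (1 - a) * num / (G(s) ** 2 * G(t) ** 2)


def fdp_kernel_delta_method(s, t, a, F):
    """Covariance of (a F W_0 - (1 - a) U W_1) / G^2 computed from R in (21)."""
    G = lambda x: (1 - a) * x + a * F(x)
    u = min(s, t)
    # R[i][j] = Cov(W_i(s), W_j(t))
    R = [
        [(1 - a) * u - (1 - a) ** 2 * s * t, -(1 - a) * s * a * F(t)],
        [-(1 - a) * t * a * F(s), a * F(u) - a ** 2 * F(s) * F(t)],
    ]
    # Weights multiplying (W_0, W_1) at s and at t
    vs = [a * F(s) / G(s) ** 2, -(1 - a) * s / G(s) ** 2]
    vt = [a * F(t) / G(t) ** 2, -(1 - a) * t / G(t) ** 2]
    return sum(vs[i] * R[i][j] * vt[j] for i in range(2) for j in range(2))
```

Evaluating `fdp_kernel_closed_form(t, t, a, F) / m` gives an approximate variance for the FDP at a fixed threshold $t$ bounded away from zero.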

Remark 4.2. A Gaussian limiting process can be obtained for FNP 
[i.e., H(t)] along similar lines. 

The next theorems follow from the previous results followed by an appli- 
cation of the functional delta method. 

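Theorem 4.3. Let $\hat Q(t) = (1 - a)t/\hat G(t)$. For any $\delta > 0$,

\[
\sqrt m\,(\hat Q(t) - Q(t)) \rightsquigarrow W
\]

on $[\delta, 1]$, where $W$ is a mean zero Gaussian process on $(0, 1]$
with covariance kernel

\[
K_Q(s, t) = (1 - a)^2 st\,\frac{G(s \wedge t) - G(s)G(t)}{G^2(s)G^2(t)} .
\]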

Theorem 4.4. Let $\hat Q(t) = (1 - a)t/\hat G(t)$. We have

\[
\sqrt m\,(\hat Q^{-1}(u) - Q^{-1}(u)) \rightsquigarrow W,
\]

where $W$ is a mean zero Gaussian process with covariance kernel

\[
K_{Q^{-1}}(u, v) = \frac{K_Q(s, t)}{Q'(s)Q'(t)}
= \frac{(1 - a)^2 st\,[G(s \wedge t) - G(s)G(t)]}
       {G(s)G(t)\,[1 - a - ug(s)]\,[1 - a - vg(t)]},
\]

with $s = Q^{-1}(u)$ and $t = Q^{-1}(v)$.

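Theorem 4.5. Let $\hat Q(t) = (1 - \hat a_0)t/\hat G(t)$, where
$\hat a_0$ is Storey's estimator. Then

\[
\sqrt m\,(\hat Q(t) - Q(t)) \rightsquigarrow W,
\]

where $W$ is a mean zero Gaussian process with covariance kernel

\[
\begin{aligned}
K(s, t) = \frac{st}{(1 - t_0)^2 G^2(s)G^2(t)}
\times \bigl\{ & G(s)G(t)t_0(1 - t_0) + G(t)(1 - G(t_0))R(s, t_0)\\
& + G(s)(1 - G(t_0))R(t, t_0) + (1 - G(t_0))^2R(s, t) \bigr\},
\end{aligned}
\]

where $R(s, t) = s \wedge t - st$.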

5. Asymptotic validity of plug-in procedures. Let
$\hat Q^{-1}(c) = \sup\{0 \le t \le 1 : \hat Q(t) \le c\}$. Then the
plug-in threshold $T_{PI}$ defined earlier can be written
$T_{PI}(P^m) = \hat Q^{-1}(\alpha)$. Here we establish the asymptotic
validity of $T_{PI}$ in the sense that
$E\,\Gamma(T) \le \alpha + o(1)$. First, suppose that $a$ is known.
Define

\[
\hat Q_a(t) = \frac{(1 - a)t}{\hat G(t)} \tag{23}
\]

to be the estimator of $Q$ when $a$ is known.

Theorem 5.1. Assume that $a$ is known and let $\hat Q = \hat Q_a$. Let
$t_0 = Q^{-1}(\alpha)$ and assume $G \ne U$. Then

\[
\sqrt m\,(T_{PI} - t_0) \rightsquigarrow N(0, K_{Q^{-1}}(t_0, t_0)),
\]

\[
\sqrt m\,(Q(T_{PI}) - \alpha) \rightsquigarrow
N\bigl(0, (Q'(t_0))^2 K_{Q^{-1}}(t_0, t_0)\bigr),
\]

and

\[
E\,\Gamma(T_{PI}) = \alpha + o(1).
\]
Proof. The first two statements follow from Theorem 4.4 and the delta
method.

For the last claim, let $0 < \delta < t_0$, write $T = T_{PI}$ and note
that

\[
\begin{aligned}
|\Gamma_m(T) - \alpha|
&\le |\Gamma_m(T) - Q(T)| + |Q(T) - \alpha|\\
&\le \sup_t |\Gamma_m(t) - Q(t)|\,1\{T < \delta\}
   + \sup_t |\Gamma_m(t) - Q(t)|\,1\{T \ge \delta\} + |Q(T) - \alpha|\\
&\le 1\{T < \delta\} + \sup_{t \ge \delta} |\Gamma_m(t) - Q(t)| + |Q(T) - \alpha|\\
&= 1\{T < \delta\} + \frac{1}{\sqrt m}\sup_{t \ge \delta}
   |\sqrt m\,(\Gamma_m(t) - Q(t))| + |Q(T) - \alpha|\\
&= O_P(m^{-1/2}).
\end{aligned}
\]

Because $0 \le \Gamma_m \le 1$, the sequence is uniformly integrable,
and the result follows. □

Next, we consider the case where $a$ is unknown and possibly
nonidentifiable. In this case, as we have seen, one can still construct
an estimator that is consistent for some value $a_0 \le a$.

Theorem 5.2 (Asymptotic validity of plug-in method). Assume that $G$ is
concave. Let $T = t(\hat a, \hat G)$ be a plug-in threshold where
$\hat G$ is the empirical CDF or the LCM and
$\hat a \xrightarrow{P} a_0$ for some $a_0 \le a$. Then

\[
E\,\Gamma(T) \le \alpha + o(1).
\]

Proof. First note that the concavity of $G$ implies that
$Q(t) = (1 - a)t/G(t)$ is increasing. Let $\delta = (a - a_0)/(1 - a)$
so that $(1 - a_0)/(1 - a) = 1 + \delta$. Then

\[
\hat Q(t) = \frac{(1 - \hat a)t}{\hat G(t)}
= \frac{1 - \hat a}{1 - a_0}\,(1 + \delta)\,\hat Q_a(t)
= (1 + o_P(1))(1 + \delta)\,\hat Q_a(t),
\]

where $\hat Q_a$ is defined in (23). Hence

\[
T = \hat Q^{-1}(\alpha)
= \hat Q_a^{-1}\Bigl( \frac{\alpha}{1 + \delta} + o_P(1) \Bigr)
\le \hat Q_a^{-1}(\alpha + o_P(1)) = \hat Q_a^{-1}(\alpha) + o_P(1).
\]

Because $\hat Q \xrightarrow{P} Q_{a_0}$ and because
$Q_a(t) \le Q_{a_0}(t)$, the result follows from the argument used in
the proof of the previous theorem using $Q_{a_0}$ in place of $Q_a$. □

Recall that the oracle procedure is defined by
$T_O(P^m) = Q^{-1}(\alpha)$. This procedure has the smallest FNR for all
procedures that attain FDR $\le \alpha$ up to sets of exponentially
small probability [cf. Genovese and Wasserman (2002), page 506]. In the
nonidentifiable case, no data-based method can distinguish $a$ and
$\underline{a}$, so the performance of this oracle cannot be attained.
We thus define the achievable oracle procedure $T_{AO}$ to be analogous
to $T_O$ with $(1 - \underline{a})t/G(t)$ replacing $Q$. The plug-in
procedure that uses the estimator $\hat a$ described in Theorem 3.2
asymptotically attains the performance of $T_{AO}$ in the sense that
$E\,\Gamma(T_{PI}) = \alpha + o(1)$ and
$E\,\Xi(T_{PI}) = E\,\Xi(T_{AO}) + o(1)$.

6. Confidence envelopes for FDP. Because the distribution of the FDP 
need not be concentrated around its expected value, controlling the FDR 
does not necessarily offer high confidence that the FDP will be small. As an 
alternative, we develop methods in this section for making inferences about 
the FDP process. 

A $1 - \alpha$ confidence envelope for the FDP process is a random
function $\overline{\Gamma}$ on $[0, 1]$ such that

\[
P\{\Gamma(t) \le \overline{\Gamma}(t) \text{ for all } t\} \ge 1 - \alpha.
\]

In this section we give two methods for constructing such a
$\overline{\Gamma}$, one asymptotic, one exact in finite samples. See
also Havranek and Chytil (1983), Hommel and Hoffman (1987) and Halperin,
Lan and Hamdy (1988).

Besides being informative in its own right, a confidence envelope can be
used to construct thresholds that control quantiles of the FDP
distribution. We call $T$ a $1 - \alpha$ confidence threshold if there
exists a statistic $Z$ such that

\[
P\{\Gamma(T) \le Z\} \ge 1 - \alpha.
\]

We consider two cases. In the first, called rate ceiling confidence
thresholds, we take $Z$ to be a prespecified constant $c$ (the ceiling).
The thresholds we develop here are derived from a confidence envelope
$\overline{\Gamma}$ as the maximal threshold such that
$\overline{\Gamma} \le c$. In the second, called minimum rate confidence
thresholds, the threshold is derived from $\overline{\Gamma}$ by
$T = \arg\min_t \overline{\Gamma}(t)$ and $Z = \overline{\Gamma}(T)$.
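
Given any envelope function, the two kinds of confidence threshold can be read off directly. The sketch below does this over a finite grid of candidate thresholds; the function names, the grid search and the toy envelope used in the checks are assumptions of this illustration and say nothing about how the envelope itself is constructed.

```python
def rate_ceiling_threshold(envelope, grid, c):
    """Largest grid point t with envelope(t) <= c, or 0.0 if there is none."""
    feasible = [t for t in grid if envelope(t) <= c]
    return max(feasible) if feasible else 0.0


def minimum_rate_threshold(envelope, grid):
    """Grid point minimizing the envelope, together with the bound Z attained there."""
    t_best = min(grid, key=envelope)
    return t_best, envelope(t_best)
```

The rate ceiling version fixes the bound $c$ in advance and maximizes the threshold (and hence the number of rejections); the minimum rate version lets the data determine the bound $Z$.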

When a is known, it is possible to construct an asymptotic rate ceiling 
confidence threshold directly. 

Theorem 6.1. Let $t_c = Q^{-1}(c)$ and let $K_\Omega(s, t)$ be the
covariance kernel defined in (22). Assume that $F \ne U$. Define

\[
t_{c,m}(\alpha) \equiv t_c - \frac{z_\alpha}{\sqrt m}\,
\frac{\sqrt{K_\Omega(t_c, t_c)}}{1 - a - cg(t_c)} .
\]

Then

\[
P\{\Gamma(t_{c,m}) \le c\} = 1 - \alpha + O(m^{-1/2}).
\]

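Proof. We have

\[
\begin{aligned}
P\{\Gamma(t_{c,m}) \le c\}
&= P\{\Omega_c(t_{c,m}) - \mu(t_{c,m}) \le -\mu(t_{c,m})\}\\
&= P\{\sqrt m\,\Omega_c(t_c) \le -\sqrt m\,\mu(t_{c,m})\} + o(1),
\end{aligned}
\]

from Lemma 6.1. It suffices, in light of Theorem 4.1 and Lemma 6.1, to
show that

\[
-\sqrt m\,\mu(t_{c,m}) \to z_\alpha\sqrt{K_\Omega(t_c, t_c)} .
\]

Now $\mu(t_c) = (1 - a)t_c - cG(t_c) = 0$ since $Q(t_c) = c$. Hence

\[
\begin{aligned}
\mu(t) &= (t - t_c)\mu'(t_c) + o(|t - t_c|)\\
&= (t - t_c)(1 - a - cg(t_c)) + o(|t - t_c|).
\end{aligned}
\]

Hence

\[
\mu(t_{c,m}) = (t_{c,m} - t_c)(1 - a - cg(t_c)) + o(m^{-1/2}).
\]

The result follows from the definition of $t_{c,m}$. □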

Lemma 6.1. Let $t_c = Q^{-1}(c)$, and assume $0 < t_c < 1$. If
$t_{c,m} - t_c = O(m^{-1/2})$, then
$\Omega_c(t_{c,m}) - \mu(t_{c,m}) = \Omega_c(t_c) + o_P(m^{-1/2})$.
Thus, if $u_m = vm^{-1/2} + o(m^{-1/2})$ for some $v$,

\[
P\{\Omega_c(t_{c,m}) \le \mu(t_{c,m}) + u_m\}
- P\{\Omega_c(t_c) \le u_m\} = o(1).
\]

Proof. Note that $\mu(t_c) = (1 - a)t_c - cG(t_c) = 0$ and that

\[
\begin{aligned}
|\Omega_c(t_{c,m}) - \Omega_c(t_c)|
&\le \max\{c, 1 - c\}\,m^{-1}\sum_i |1\{P_i \le t_{c,m}\} - 1\{P_i \le t_c\}|\\
&\le |G_m(t_{c,m}) - G_m(t_c)|,
\end{aligned}
\]

which is $\mathrm{Binomial}(m, |G(t_{c,m}) - G(t_c)|)/m$ and has
variance of order $m^{-3/2}$. Hence
$\Omega_c(t_{c,m}) - \mu(t_{c,m}) - \Omega_c(t_c) = o_P(m^{-1/2})$.
The second claim is immediate. □

However, when $a$ is unknown, there is a problem. When plugging in a
consistent estimator of $a$ that converges at a sub-$\sqrt m$ rate, the
error in $\hat a$ is of larger order than $t_c - t_{c,m}$. Using an
estimator, such as Storey's estimator, which converges at a
$1/\sqrt m$ rate but is asymptotically biased, causes overcoverage
because the asymptotic bias dominates. Interestingly, as demonstrated in
the next section, it is possible to ameliorate the bias problem, but not
the rate problem, with appropriate conditions. Thus, a "better"
estimator of $a$ need not lead to a valid confidence threshold.

6.1. Asymptotic confidence envelope. In this section, we show how to
obtain an asymptotic confidence envelope for $\Gamma$, centered at $\hat Q$. Throughout
this section we use $\hat G$ based on the empirical distribution function, not the
LCM.

For reasons explained in the last section, we use Storey's estimator rather
than the consistent estimators of $a$ described earlier. That is, let
$\hat a_0 = (\hat G(t_0) - t_0)/(1 - t_0)$ be Storey's estimator for a fixed
$t_0 \in (0, 1)$. Then

\[ \hat Q(t) = \frac{(1 - \hat a_0)\,t}{\hat G(t)} = \frac{1 - \hat G(t_0)}{1 - t_0}\,\frac{t}{\hat G(t)}. \]
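
As a small illustration of the plug-in quantities above, the following Python sketch computes Storey's estimator and the resulting estimate of $Q$ from a list of p-values, using the empirical CDF for $\hat G$. The function names are hypothetical, the estimator is not clipped to $[0, 1]$, and returning 0 when no p-value lies below $t$ is an assumed convention.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def ecdf(pvals: Sequence[float]) -> Callable[[float], float]:
    """Empirical CDF of the p-values: t -> #{i : p_i <= t} / m."""
    s = sorted(pvals)
    m = len(s)
    return lambda t: bisect_right(s, t) / m


def storey_a0(pvals: Sequence[float], t0: float) -> float:
    """Storey-type estimate (G_hat(t0) - t0) / (1 - t0) for a fixed 0 < t0 < 1."""
    return (ecdf(pvals)(t0) - t0) / (1.0 - t0)


def q_hat(pvals: Sequence[float], t0: float, t: float) -> float:
    """Plug-in estimate (1 - a0_hat) * t / G_hat(t); 0.0 if nothing is rejected."""
    g = ecdf(pvals)(t)
    if g == 0.0:
        return 0.0  # assumed convention when no p-value is <= t
    return (1.0 - storey_a0(pvals, t0)) * t / g


# The 15 p-values quoted in Example 1 of Section 6.3.
PVALS = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
         0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.0]
print(storey_a0(PVALS, 0.5))      # (11/15 - 0.5) / 0.5
print(q_hat(PVALS, 0.5, 0.0095))  # (8/15) * 0.0095 / (4/15)
```

The choice $t_0 = 0.5$ in the usage lines is arbitrary and only serves to make the arithmetic easy to check by hand.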

To make the asymptotic bias in Storey's estimator negligible, we make
the additional assumption that $F$ depends on a further parameter $\nu = \nu(m)$
in such a way that

\[ F_\nu(t) \ge 1 - e^{-\nu c(t)} \tag{24} \]

for some $c(t) > 0$, for all $0 < t < 1$. The marginal distribution of $P_i$ becomes

\[ G_m = (1 - a)U + aF_{\nu(m)}. \]

This assumption will hold in a variety of settings such as the following: 

1. The p-values $P_i$ are computed from some test statistics $Z_i$ that involve
a common sample size $n$, where the tests all satisfy the standard large
deviation principle [van der Vaart (1998), page 209]. In this case $\nu = n$.

2. As in the previous case except that each test has a sample size $n_i$ drawn
from some common distribution.

3. Each test is based on measurements from a counting process (such as an
astronomical image) where $\nu$ represents exposure time.

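Under these assumptions, we have the following theorem.

Theorem 6.2. Let $t_m$ be such that $t_m \to 0$ and $m t_m/(\log m)^4 \to \infty$. Let
$w_{\alpha/2}$ denote the upper $\alpha/2$ quantile of $\max_{0 < t < 1} B(t)/\sqrt{t}$,
where $B(t)$ denotes a standard Brownian bridge. Let

\[ \Delta_m = \max\left\{ 2(1 - \hat a_0)\,w_{\alpha/2},\ \frac{\sqrt{2\log(4/\alpha)}}{1 - t_0} \right\}. \tag{25} \]

Define

\[ \bar\Gamma(t) = \min\left\{ \hat Q(t) + \frac{\Delta_m \sqrt{t}}{\sqrt{m}\,\hat G(t)},\ 1 \right\}. \tag{26} \]

Assume that

\[ \frac{\nu(m)}{\log m} \to \infty \tag{27} \]

as $m \to \infty$. Then

\[ \liminf_{m \to \infty} P\{\Gamma(t) \le \bar\Gamma(t) \text{ for all } t \ge t_m\} \ge 1 - \alpha. \tag{28} \]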

Proof. Let

\[ N(t) = \frac{M_{1|0}(t)}{m} = \frac{1}{m} \sum_{i=1}^{m} (1 - H_i)\,1\{P_i \le t\}. \]

Note that $E(N(t)) = (1 - a)t$ and $\mathrm{Cov}(N(t), N(s)) = (1 - a)^2 (s \wedge t - st)$.
By Donsker's theorem, $\sqrt{m}\,(N(t) - (1 - a)t) \rightsquigarrow (1 - a)B(t)$, where $B(t)$ is
a standard Brownian bridge. By the Hungarian embedding, there exists a
sequence of standard Brownian bridges $B_m(t)$ such that

\[ N(t) = (1 - a)t + \frac{(1 - a)B_m(t)}{\sqrt{m}} + R_m(t), \]

where

\[ R_m \equiv \sup_t |R_m(t)| = O\!\left( \frac{(\log m)^2}{m} \right) \quad \text{a.s.} \]

Let

\[ V(t) = (1 - \hat a_0)\,t + \frac{\Delta_m \sqrt{t}}{\sqrt{m}}. \tag{29} \]

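Now,

\[
\begin{aligned}
&P\{N(t) > V(t) \text{ for some } t \ge t_m\} \\
&\quad = P\left\{ (1 - a)t + \frac{(1 - a)B_m(t)}{\sqrt{m}} + R_m(t)
      > (1 - \hat a_0)t + \frac{\Delta_m \sqrt{t}}{\sqrt{m}} \text{ for some } t \ge t_m \right\} \\
&\quad \le P\left\{ \max_t \sqrt{m}\,|\hat a_0 - a| \sqrt{t} > \frac{\Delta_m}{2} \right\}
      + P\left\{ (1 - a) \max_{t \ge t_m} \frac{B_m(t)}{\sqrt{t}} > \frac{\Delta_m}{2} \right\}
      + P\{[\ldots]\}.
\end{aligned}
\tag{30}
\]

The last term is $o(1)$ since $m t_m/(\log m)^4 \to \infty$.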
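Let

\[ a_0 = \frac{G_m(t_0) - t_0}{1 - t_0} = a\,\frac{F_{\nu(m)}(t_0) - t_0}{1 - t_0}. \]

Then

\[ a - a_0 = a\,\frac{1 - F_{\nu(m)}(t_0)}{1 - t_0} \le \frac{a\,e^{-\nu(m) c(t_0)}}{1 - t_0}. \]

By assumption, we can write

\[ \nu(m) = \frac{s_m \log m}{c(t_0)} \]

for some $s_m \to \infty$. Hence $a - a_0 = O(m^{-s_m})$. In particular,
$a - a_0 = o(1/\sqrt{m})$. Hence

\[ \sqrt{m}\,|\hat a_0 - a| \le \sqrt{m}\,|\hat a_0 - a_0| + \sqrt{m}\,|a_0 - a| = \sqrt{m}\,|\hat a_0 - a_0| + o(1). \]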
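Thus

\[
\begin{aligned}
P\left\{ \max_t \sqrt{m}\,|\hat a_0 - a| \sqrt{t} > \frac{\Delta_m}{2} \right\}
  &= P\left\{ \sqrt{m}\,|\hat a_0 - a| > \frac{\Delta_m}{2} \right\} \\
  &= P\left\{ \sqrt{m}\,|\hat a_0 - a_0| > \frac{\Delta_m}{2} \right\} + o(1) \\
  &= P\left\{ \frac{\sqrt{m}\,|\hat G(t_0) - G_m(t_0)|}{1 - t_0} > \frac{\Delta_m}{2} \right\} + o(1) \\
  &= P\left\{ |\hat G(t_0) - G_m(t_0)| > \frac{\Delta_m (1 - t_0)}{2\sqrt{m}} \right\} + o(1) \\
  &\le 2 \exp\left\{ -2m\,\frac{\Delta_m^2 (1 - t_0)^2}{4m} \right\} + o(1) \\
  &\le \frac{\alpha}{2} + o(1)
\end{aligned}
\tag{31}
\]

by the DKW inequality and the definition of $\Delta_m$.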

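Fix $\varepsilon > 0$. Since $\hat a_0 \to a_0$, we have, almost surely for all large $m$, that

\[
\frac{\Delta_m}{2(1 - a)} \ge \frac{2(1 - \hat a_0)\,w_{\alpha/2}}{2(1 - a)}
  = \frac{1 - \hat a_0}{1 - a}\,w_{\alpha/2}
  = \frac{1 - a_0}{1 - a}\,(1 + o(1))\,w_{\alpha/2}
  \ge w_{\alpha/2} - \varepsilon.
\]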

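Let $W_m(t) = B_m(t)/\sqrt{t}$. Then for all large $m$

\[
\begin{aligned}
P\left\{ (1 - a) \max_{t \ge t_m} W_m(t) > \frac{\Delta_m}{2} \right\}
  &= P\left\{ \max_{t \ge t_m} W_m(t) > \frac{\Delta_m}{2(1 - a)} \right\} \\
  &\le P\left\{ \max_{t \ge t_m} W_m(t) > w_{\alpha/2} - \varepsilon \right\} \\
  &\le P\left\{ \max_{0 < t < 1} W_m(t) > w_{\alpha/2} \right\}
     + P\left\{ w_{\alpha/2} - \varepsilon < \max_{0 < t < 1} W_m(t) \le w_{\alpha/2} \right\} \\
  &= \frac{\alpha}{2} + P\left\{ w_{\alpha/2} - \varepsilon < \max_{0 < t < 1} W_m(t) \le w_{\alpha/2} \right\}.
\end{aligned}
\]

Since $\varepsilon$ is arbitrary, this implies that

\[ \limsup_{m \to \infty} P\left\{ (1 - a) \max_{t \ge t_m} W_m(t) > \frac{\Delta_m}{2} \right\} \le \frac{\alpha}{2}. \tag{32} \]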

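From (31), (32) and (30) we conclude that

\[ \limsup_{m \to \infty} P\{N(t) > V(t) \text{ for some } t \ge t_m\} \le \alpha. \]

Notice that $\Gamma(t) = N(t)/\hat G(t)$ and that $\Gamma(t) \le 1$. Hence $N(t) \le V(t)$
implies that

\[ \Gamma(t) \le \min\left\{ \frac{V(t)}{\hat G(t)},\ 1 \right\} = \bar\Gamma(t). \]

The conclusion follows. □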
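
As a rough illustration of how an envelope of the form used in Theorem 6.2 might be evaluated on data, consider the following Python sketch. It is not the authors' code and rests on several assumptions that should be checked against the theorem: the function name is hypothetical; the quantile $w_{\alpha/2}$ is supplied by the user (for example, from a separate simulation); $\Delta_m$ is taken as the larger of $2(1 - \hat a_0) w_{\alpha/2}$ and the DKW-calibrated constant $\sqrt{2\log(4/\alpha)}/(1 - t_0)$; the envelope adds $\Delta_m \sqrt{t}/(\sqrt{m}\,\hat G(t))$ to $\hat Q(t)$ and is capped at 1; and the envelope is set to 1 where no p-value falls below $t$.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def asymptotic_envelope(pvals: Sequence[float], t0: float, alpha: float,
                        w_half_alpha: float, grid: Sequence[float]) -> List[float]:
    """Evaluate min{Q_hat(t) + Delta * sqrt(t) / (sqrt(m) * G_hat(t)), 1} on a grid.

    w_half_alpha is an assumed, user-supplied upper alpha/2 quantile of
    max B(t)/sqrt(t) for a standard Brownian bridge B.
    """
    s = sorted(pvals)
    m = len(s)

    def g_hat(t: float) -> float:
        return bisect_right(s, t) / m  # empirical CDF of the p-values

    a0_hat = (g_hat(t0) - t0) / (1.0 - t0)  # Storey-type estimate at t0
    delta = max(2.0 * (1.0 - a0_hat) * w_half_alpha,
                math.sqrt(2.0 * math.log(4.0 / alpha)) / (1.0 - t0))

    env = []
    for t in grid:
        g = g_hat(t)
        if g == 0.0:
            env.append(1.0)  # assumed convention: uninformative with no rejections
            continue
        q = (1.0 - a0_hat) * t / g
        env.append(min(q + delta * math.sqrt(t) / (math.sqrt(m) * g), 1.0))
    return env
```

The theorem only covers thresholds $t \ge t_m$, so values of the sketch at very small $t$ should not be read as carrying the stated coverage.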

Both types of confidence thresholds can now be defined from $\bar\Gamma$. For
example, pick a ceiling $0 < c < 1$ and define
$T_c = \sup\{t \ge t_m : \bar\Gamma(t) \le c\}$, where $T_c$ is defined to be 0 if no
such $t$ exists. The proof of the following is then immediate from the
previous theorem.

Corollary 6.1. $T_c$ is an asymptotic rate ceiling confidence threshold
with ceiling $c$.

It is also worth noting that we can construct a confidence envelope for
the number of false discoveries process $M_{1|0}(t)$.

Corollary 6.2. With $t_m$ as in the above theorem and $V(t)$ defined as
in (29),

\[ \liminf_{m \to \infty} P\{M_{1|0}(t) \le mV(t) \text{ for } t \ge t_m\} \ge 1 - \alpha. \tag{33} \]

6.2. Exact confidence envelope. In this section we will construct confi- 
dence thresholds that are valid for finite samples. 

Let $0 < \alpha < 1$. Given $V_1, \ldots, V_k$, let $\varphi_\alpha^k(v_1, \ldots, v_k)$ be a
nonrandomized level $\alpha$ test of the null hypothesis that $V_1, \ldots, V_k$ are
drawn i.i.d. from a Uniform(0, 1) distribution. Define
$p^0(h^m) = (p_i : h_i = 0,\ 1 \le i \le m)$ and $m_0(h^m) = \sum_{i=1}^{m} (1 - h_i)$ and
$\mathcal{U}_\alpha(p^m) = \{h^m \in \{0,1\}^m : \varphi_\alpha^{m_0(h^m)}(p^0(h^m)) = 0\}$.
Note that as defined, $\mathcal{U}_\alpha$ always contains the vector $(1, 1, \ldots, 1)$.

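Let

\[ \mathcal{G}_\alpha(p^m) = \{\Gamma(\cdot\,; h^m, p^m) : h^m \in \mathcal{U}_\alpha(p^m)\}, \tag{34} \]

\[ \mathcal{M}_\alpha(p^m) = \{m_0(h^m) : h^m \in \mathcal{U}_\alpha(p^m)\}. \tag{35} \]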

Then we have the following theorem, which follows from standard results on 
inverting hypothesis tests to construct confidence sets. 

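Theorem 6.3. For all $0 < a < 1$, $F$, and positive integers $m$,

\[ P_{a,F}\{H^m \in \mathcal{U}_\alpha(P^m)\} \ge 1 - \alpha, \tag{36} \]

\[ P_{a,F}\{M_0 \in \mathcal{M}_\alpha(P^m)\} \ge 1 - \alpha, \tag{37} \]

\[ P_{a,F}\{\Gamma(\cdot\,; H^m, P^m) \in \mathcal{G}_\alpha\} \ge 1 - \alpha, \tag{38} \]

\[ P_{a,F}\{\Gamma(T_c) \le c\} \ge 1 - \alpha, \tag{39} \]

where

\[ T_c = \sup\{t : \Gamma(t; h^m, P^m) \le c \text{ and } h^m \in \mathcal{U}_\alpha(P^m)\}. \tag{40} \]

In particular,

\[ \bar\Gamma(t) = \sup\{\Gamma(t) : \Gamma \in \mathcal{G}_\alpha(P^m)\} \tag{41} \]

is a $1 - \alpha$ confidence envelope for $\Gamma$, and $T_c$ is a $1 - \alpha$ rate ceiling
confidence threshold with ceiling $c$. In fact,
$\inf_{a,F} P_{a,F}\{\Gamma(t) \le \bar\Gamma(t), \text{ for all } t\} \ge 1 - \alpha$.

Remark 6.1. If there is some substantive reason to bound $M_0$ from
below, then $\mathcal{G}_\alpha$ will have a nontrivial lower bound as well. In general,
because $\mathcal{U}_\alpha$ always contains $(1, 1, \ldots, 1)$, the pointwise infimum of
functions in $\mathcal{G}_\alpha$ will be zero.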

Remark 6.2. At first glance, computation of $\mathcal{U}_\alpha$ would appear to require
an exponential-time algorithm. However, for broad classes of tests, including
the Kolmogorov-Smirnov test, it is possible to construct $\mathcal{U}_\alpha$ in polynomial
time.

Remark 6.3. The choice of test can be important for obtaining a good 
confidence envelope. A full analysis of this choice is beyond the scope of 
this paper; we will present such an analysis in a forthcoming paper. In the 
examples below, we use the test derived from the second-order statistic of a 
subset of p- values. 

Remark 6.4. A similar construct yields a confidence envelope on the
process $M_{1|0}(t)$.
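
To make the test-inversion construction of Theorem 6.3 concrete, here is a brute-force Python sketch for very small $m$. It enumerates every indicator vector, which is exactly the exponential-time approach that Remark 6.2 says can be avoided, so it is meant only as an illustration. The uniformity test used here, which rejects when the smallest of $k$ values falls below $1 - (1 - \alpha)^{1/k}$, is an assumption chosen for simplicity (it has exact level $\alpha$ under i.i.d. uniforms); it is not the second-order-statistic test used in the examples. Function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence


def min_test_rejects(vals: Sequence[float], alpha: float) -> bool:
    """Level-alpha test of i.i.d. Uniform(0,1) based on the minimum (assumed test)."""
    k = len(vals)
    if k == 0:
        return False  # no nulls to test, so the all-ones vector is always retained
    return min(vals) < 1.0 - (1.0 - alpha) ** (1.0 / k)


def exact_envelope(pvals: Sequence[float], alpha: float,
                   grid: Sequence[float]) -> List[float]:
    """Pointwise supremum of the FDP over all indicator vectors the test accepts."""
    m = len(pvals)
    rejected = [sum(p <= t for p in pvals) for t in grid]
    env = [0.0] * len(grid)
    for h in product((0, 1), repeat=m):  # h_i = 0 marks a true null
        nulls = [p for p, hi in zip(pvals, h) if hi == 0]
        if min_test_rejects(nulls, alpha):
            continue  # this vector is excluded from the confidence set
        for j, t in enumerate(grid):
            if rejected[j] == 0:
                continue  # FDP taken to be 0 when nothing is rejected
            fdp = sum(p <= t for p in nulls) / rejected[j]
            env[j] = max(env[j], fdp)
    return env


# Hypothetical p-values, for illustration only.
demo = [0.001, 0.002, 0.2, 0.5, 0.7, 0.9]
print(exact_envelope(demo, 0.05, [0.002, 0.9]))
```

In this toy input, any vector that labels one of the two smallest p-values as a true null is rejected by the minimum test, so the envelope is 0 at the small threshold and rises to 4/6 once all six hypotheses are rejected.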



6.3. Examples. 

Example 1. We begin with a re-analysis of Example 3.2 from BH
(1995). BH give the following 15 p-values

0.0001 0.0004 0.0019 0.0095 0.0201 0.0278 0.0298 0.0344 
0.0459 0.3240 0.4262 0.5719 0.6528 0.7590 1 

and at a 0.05 level Bonferroni rejects the first three null hypotheses and the 
BH method rejects the first four. 
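
The rejection counts quoted above can be checked with a short Python sketch. The function names are hypothetical; the procedures are the standard Bonferroni rule (reject when $p \le \alpha/m$) and the Benjamini-Hochberg step-up rule (reject the $k$ smallest p-values, where $k$ is the largest index with $P_{(k)} \le k\alpha/m$).

```python
from typing import Sequence

PVALS = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
         0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.0]


def bonferroni_rejections(pvals: Sequence[float], level: float) -> int:
    """Number of hypotheses with p <= level / m."""
    m = len(pvals)
    return sum(p <= level / m for p in pvals)


def bh_rejections(pvals: Sequence[float], level: float) -> int:
    """Largest k such that the k-th smallest p-value is <= k * level / m."""
    s = sorted(pvals)
    m = len(s)
    k = 0
    for i, p in enumerate(s, start=1):
        if p <= i * level / m:
            k = i
    return k


print(bonferroni_rejections(PVALS, 0.05), bh_rejections(PVALS, 0.05))  # prints: 3 4
```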

Because m is small, we construct only the exact confidence envelope for 
this example. Figure 1 shows the upper 95% confidence envelope on the 
FDP for these data using the second-order statistic of any subset as a test 
statistic for the exact procedure. Notice first that the confidence envelope 
never drops below 0.05. Second, while the BH threshold $T = P_{(4)} = 0.0095$
guarantees an FDR $\le 0.05$, we can claim that $P\{\Gamma(P_{(4)}) > 0.25\} \le 0.05$, but
this is also true for the larger threshold $P_{(11)} = 0.4262$, which will have
higher power. This difference occurs because the envelope takes large values
at small thresholds. The result could be quite different with another choice
of test statistic. The minimum rate 95% confidence threshold has $T = 0.324$
and $Z = \bar\Gamma(T) = 0.111$.

Example 2. We present a simple, synthetic example, where $m = 1000$,
$a = 0.25$, and the test statistic is from a Normal(0,1) one-sided test with
$H_0: \theta = 0$ and $H_1: \theta = 3$.

Figure 2 compares the true FDP sample path with the 95% confidence 
envelopes derived from the exact and asymptotic methods. For small values 
of the threshold the exact envelope almost matches the truth, but for larger 
values it becomes more conservative. The asymptotic envelope remains above 
but generally close to the truth. The asymptotic and exact envelopes cross 
at an FDP level of about 0.05. The rate ceiling confidence thresholds with 
ceiling 0.05 and level 0.05 are 0.00062 for the asymptotic and 0.00046 for 
the exact. The minimum rate confidence threshold for the exact procedure 
has T = 0.00039 and Z = 0.011. 
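
A data set of the kind described in Example 2 can be generated along the following lines. This Python sketch is an assumption-laden illustration rather than the authors' simulation: the function names and the seed are hypothetical, each hypothesis is a false null independently with probability $a$, the test statistic is a single Normal draw with unit variance, and the FDP is taken to be 0 when nothing is rejected. It will not reproduce the specific numbers reported above.

```python
import math
import random
from typing import List, Sequence, Tuple


def simulate_pvalues(m: int = 1000, a: float = 0.25, theta: float = 3.0,
                     seed: int = 0) -> Tuple[List[float], List[int]]:
    """Draw indicators H_i ~ Bernoulli(a) and one-sided Normal p-values."""
    rng = random.Random(seed)
    h = [1 if rng.random() < a else 0 for _ in range(m)]   # 1 = false null
    z = [rng.gauss(theta * hi, 1.0) for hi in h]           # mean 0 or theta
    p = [0.5 * math.erfc(zi / math.sqrt(2.0)) for zi in z]  # upper-tail p-value
    return p, h


def fdp_path(pvals: Sequence[float], h: Sequence[int],
             thresholds: Sequence[float]) -> List[float]:
    """True FDP at each threshold: false rejections / rejections (0 if none)."""
    out = []
    for t in thresholds:
        rejected = sum(p <= t for p in pvals)
        false_rej = sum(p <= t and hi == 0 for p, hi in zip(pvals, h))
        out.append(false_rej / rejected if rejected else 0.0)
    return out


p, h = simulate_pvalues()
print(fdp_path(p, h, [0.0005, 0.001, 0.01, 0.05]))
```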

APPENDIX 

Algorithm for finding $\hat F$. Here we restrict our attention to the case in
which we take $\hat F$ as piecewise constant on the same grid as $\hat G$. When $F$ is
concave, the algorithm works in the same way with the sharper piecewise
linear approximation.




Fig. 1. Plot of $\bar\Gamma(t)$ versus $t$ for Example 1, where $\bar\Gamma$ is derived from the exact method
of Section 6.2. The leftmost dot on the horizontal axis is the BH threshold; the rightmost
dot is a confidence threshold with the same ceiling.

Step 0. Begin by constructing an initial estimate of $F$ that is a CDF.
For example, we can define $H$ to be the piecewise constant function that
takes the following values on the $P_{(j)}$'s:






G(Po-))-(l-a)P(,) 



Step 1. Identify the segment with the biggest absolute difference
between $\hat G$ and $(1 - a)U + aH$.

Step 2. Determine how far and in what direction (up or down) this
segment can be moved while keeping $H$ a CDF and minimizing
$\|\hat G - (1 - a)U - aH\|_\infty$.

Step 3. If the segment can be moved, move it and go to Step 1. Else
go to Step 4.

Step 4. If no segment can be moved to reduce $\|\hat G - (1 - a)U - aH\|_\infty$,
STOP. If the current segment is part of a contiguous block of segments where one
segment in the block can be moved to reduce $\|\hat G - (1 - a)U - aH\|_\infty$, move
the segment at the end of the contiguous block of segments that provides
the greatest reduction in $\|\hat G - (1 - a)U - aH\|_\infty$. Go to Step 1.

Fig. 2. Plot of the true $\Gamma$ sample paths and $\bar\Gamma$ for the exact (cf. Section 6.2) and
asymptotic (cf. Section 6.1) methods for the data in Example 2. The envelopes are shown here
only for small thresholds. The truth (solid) is the lowest curve over the entire domain.
The exact envelope (dashed) begins near 1, dips toward the truth and then rises sharply.
The asymptotic envelope (dotted) is the other curve.